Recent advances in transcribing television and radio broadcasts

Authors

  • Jean-Luc Gauvain
  • Lori Lamel
  • Gilles Adda
  • Michèle Jardino
Abstract

Transcription of broadcast news shows (radio and television) is a major step in developing automatic tools for indexing and retrieving the vast amounts of information generated on a daily basis. Broadcast shows are challenging to transcribe as they consist of a continuous data stream with segments of different linguistic and acoustic natures. Transcribing such data requires addressing two main problems: those related to the varied acoustic properties of the signal, and those related to the linguistic properties of the speech. Prior to word transcription, the data is partitioned into homogeneous acoustic segments. Non-speech segments are identified and rejected, and the speech segments are clustered and labeled according to bandwidth and gender. The speaker-independent, large-vocabulary continuous speech recognizer makes use of n-gram statistics for language modeling and of continuous density HMMs with Gaussian mixtures for acoustic modeling. The LIMSI system has consistently obtained top-level performance in DARPA evaluations, with an overall word transcription error on the Nov98 evaluation test data of 13.6%. The average word error on unrestricted American English broadcast news data is under 20%.
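The word error figures quoted in the abstract are conventionally computed as the word-level edit distance (substitutions, deletions and insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the scoring tool used in the DARPA evaluations; the function name `word_error_rate` and the example sentences are hypothetical and chosen only for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical example: one word of a six-word reference is dropped.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

In this example one of the six reference words is deleted, so the error is 1/6, or about 16.7%. Because insertions are counted, the measure can exceed 100% when the hypothesis is much longer than the reference.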

Similar resources

Survey on Media Literacy of Radio and Television Program Producers (Case study: Mazandaran Radio and Television Center)

In recent decades, advanced technologies in production, infrastructure and devices for providing content and services have become available. The audience can use the media of their interest at any time, with the desired devices. Accordingly, there are many differences in the level of media literacy of the people. This study seeks to determine the level of media literacy of the producer of pro...

Building a Model of effect of radio & television on the learning of science and technology (based on radio & television managers and communication experts perspectives)

This study aims to build a model for improving the effectiveness of national media impact on the promotion of science and technology. It first reviews theoretical studies of scientific and technological culture in the world, then carries out a comparative benchmarking of Iran's position. The questionnaire was designed based on global studies an...

Transcription of broadcast news

In this paper we report on our recent work in transcribing broadcast news shows. Radio and television broadcasts contain signal segments of various linguistic and acoustic natures. The shows contain both prepared and spontaneous speech. The signal may be studio quality or have been transmitted over a telephone or other noisy channel (i.e., corrupted by additive noise and nonlinear distortions), ...

Mapping of Agricultural Information Flows for Yam Minisett Technology in Delta State, Nigeria

This study examined information flow on minisett technology among yam farmers in Delta State, Nigeria. A sample size of 180 respondents was involved in the study. Data were obtained from respondents of the study through the use of a validated interview schedule. Percentage, frequency count and mean scores were used to summarize data, while line diagrams were used to develop maps of info...

Partitioning and transcription of broadcast news data

Radio and television broadcasts consist of a continuous stream of data comprised of segments of different linguistic and acoustic natures, which poses challenges for transcription. In this paper we report on our recent work in transcribing broadcast news data [2, 4], including the problem of partitioning the data into homogeneous segments prior to word recognition. Gaussian mixture models are us...

Publication date: 1999